Three methods of intonation modeling
Abstract
This paper compares different methods of generating intonation for an American English Text-to-Speech synthesis system. We look at a primarily rule-based approach and two data-driven approaches. For data-driven modeling we used two separate data sets, each representing a somewhat different prosodic style. One database was recordings of a portion of 1989 Wall Street Journal text from the Penn Treebank Project. The second database was recordings of interactive prompts used in telephone network services. Both were read by the same female speaker. Approximately two and one-half hours of speech was phonetically and prosodically segmented and labeled (first automatically, and subsequently verified manually). The prosodic labeling used ToBI [7] tones and breaks. Three different intonation models were compared: (1) a predominantly rule-based model based on ToBI labels [3]; (2) a parametric model using the Tilt approach [8]; and (3) a Vector Quantized model based on an underlying parametric representation [5]. Sentences representative of both prosodic styles were synthesized with each of these models, and were presented to listeners for subjective ratings in a formal listening test. The results of the evaluation are reported.
Similar resources
Parametric modeling of intonation using vector quantization
In this study we propose a data-based approach to intonation modeling using vector quantization. The model is based on an F0 parametrization with an especially designed approximation function. The parameter vectors found are vector quantized with varying codebook sizes. This method is motivated by intonation theories that suggest that pitch accent and boundary phenomena can be described by a di...
Implications of Prosody Modeling for Prosody Recognition
This paper introduces Stem-ML, which is a model of the prosody generation process with an associated description language, and suggests how it may help prosody recognition. We applied Stem-ML modeling to three topics: the modeling of prosodic strengths, intonation types, and noun phrase patterns. Stem-ML parameters derived from F0 contours may have a more consistent relationship with prosodic ...
Modeling Broad Context for Tone Recognition with Conditional Random Fields
We propose a tone recognition approach that employs linear-chain Conditional Random Fields (CRF) to model tone variation due to intonation effects. We implement three linear-chain CRFs which aim at modeling intonation effects at phrase-, sentence-, and story-level boundaries, where we show that standard recognition techniques degrade and common normalization approaches do not improve. We show that all ...
Mechanisms of Question Intonation in Mandarin
This study investigates mechanisms of question intonation in Mandarin Chinese. Three mechanisms of question intonation have been proposed: an overall higher phrase curve, higher strengths of sentence final tones, and a tone-dependent mechanism that flattens the falling slope of the final falling tone and steepens the rising slope of the final rising tone. The phrase curve and strength mechanism...
Publication date: 1998